DOCUMENT RESUME

ED 130 229                                        CS 002 956

AUTHOR          Palmer, Francis H.
TITLE           The Effects of Minimal Early Intervention on
                Subsequent IQ Scores and Reading Achievement. Final
                Report.
INSTITUTION     State Univ. of New York, Stony Brook.
SPONS AGENCY    Education Commission of the States, Denver, Colo.
PUB DATE        76
CONTRACT        13-76-06846
NOTE            63p.
EDRS PRICE      MF-$0.83 HC-$3.50 Plus Postage.
DESCRIPTORS     Early Childhood Education; Elementary Education;
                Followup Studies; *Intelligence Quotient;
                *Intervention; Lower Class Males; Negro Youth;
                *Preschool Evaluation; Preschool Learning; *Preschool
                Programs; Program Evaluation; *Reading Achievement;
                *Success Factors



ABSTRACT 

IQ and reading achievement in grade five were 
examined in a ten-year follow-up study of children who had 
participated in an early-intervention program at ages 24 or 36
months. The intervention program varied age of training, type of
training (concept versus discovery), and social class for 310 black
male children from Harlem. The follow-up study obtained WISC scores 
for 139 and reading scores for 117 of the original sample. Analyses 
indicated these were representative of the original experimental and 
control samples. Comparison groups not involved in the original study 
were also drawn. Results indicated that concept training at age 24 
months or 36 months significantly affected reading in the fifth grade 
and IQ at ages 10 to 12. Intervention at age two had an effect on 
reading and IQ, whereas intervention at age three affected IQ but not 
reading. Discovery training affected IQ but did not affect reading. 
Implications of the findings for general evaluations of the success 
or failure of Headstart and other early-intervention programs are 
discussed. (AA) 



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.



U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
NATIONAL INSTITUTE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.

FINAL REPORT 




THE EFFECTS OF MINIMAL EARLY INTERVENTION ON 
SUBSEQUENT IQ SCORES AND READING ACHIEVEMENT 



By 

Francis H. Palmer 
State University of New York, Stony Brook 



Follow-up research done under contract (13-76-06846) with the Education Commission
of the States over the period October 1, 1975 to August 31, 1976.



FOREWORD

August 22, 1976 

There have been many critics of the effects of early intervention:
critics who claimed that Black children could not be affected because
they are genetically inferior and unequipped to benefit from the public
schools as their middle class peers do, and critics who claimed that the
number of hours a child is exposed to intervention is insufficient to
counteract the devastating effects of the ghetto family and community.

These data suggest that those critics are wrong. 

When the funds were made available by ECS for this follow-up study, 
this investigator did not hope that the results would be as convincing 
as he now thinks they are. The study shows an intervention at a specific 
period of time had a clear and meaningful effect on the subsequent scholastic 
achievement of both middle and lower-class Black boys. 

Furthermore, this study provides information about two types of
training, and two times when training was introduced, and the effects 
those variables have on subsequent school performance. This investigator 
is too close to the data at this time to speculate on how these results 
shall ultimately modify the consensus of belief about age and type of 
training - but the data are clearly relevant to those compelling subjects. 

If one wants to know how early childhood intervention affects scholastic
performance at age 12, someone must support a follow-up to assess children
at age 12. ECS provided that support.

Surely, having the opportunity to obtain these results is more 
personally rewarding than had I invented the transistor. 






THE EFFECTS OF EARLY INTERVENTION ON IQ AND READING ACHIEVEMENT

I. PURPOSE

The purpose of this study was to locate and assess in the fifth
grade as many as possible of the original 310 children in the Harlem
Research Center early intervention program. That study, initiated in 1966
with a grant from the National Institutes of Health, provided early
intellective training for 240 two- and three-year-old Black male children,
as well as annual assessment of 70 control children who did not participate
in the training program. At grade 5, achievement is defined as the IQ
derived from the Wechsler Intelligence Scale for Children (WISC) and the
reading achievement scores obtained from the school each child attended.

II. BACKGROUND 

A. Research Design

The study varied age of training, type of training, and social class.
The age at which training began was 24 months in 1966 (T-2) and
36 months in 1967 (T-3).

At each age, 120 children served as subjects (T-2 = 120; T-3 = 120).

Type of training consisted of two conditions: Concept Training (CT)
and Discovery Training (DT). Both groups attended the same facility twice
weekly for one hour over a period of eight months. Both were exposed to
one-to-one instruction, the same instructors, and identical training
materials. Instructors were rotated every six sessions. No child had the
same instructor for more than six sessions. Conditions were identical
except for what occurred during the training session. The CT group
received a structured curriculum designed to teach concepts such as
big-little, rough-smooth, wet-dry, loud-not loud, next to-far away, etc.
Those concepts were taught



Palmer                                                            Page 2

in a fixed procedure involving four steps, with a specified criterion for
knowing each concept. The instructor planned for each session,
predetermining what concepts would be taught, and with what materials each
concept would be taught. He guided the training, initiated most
conversations, and recorded the results of each session so that he or
subsequent instructors knew what had occurred with the child previously.
The DT group had no curriculum nor any structured training. Instructors
did not initiate conversations, but responded to questions and gestures
the child made. Instructors were trained not to emphasize the concepts
being taught in the CT group. In many respects, the training was passive,
not unlike what occurs in many child care centers, except that the
one-to-one situation prevailed at all times. Detailed descriptions of the
CT and DT conditions have been documented before (Palmer, 1971).

Studies were conducted during the training period to ascertain that
instructors differentiated between their CT and DT roles. Those studies
showed, for example, that the DT training involved only 12% as much
conversation with the child as the CT training did. Thus, it was shown
that the instructors provided the two types of training specified by the
research design.

Social Class was defined by the education and occupation of the
parents, and measured by the Hollingshead-Redlich Two Factor Scale of
Social Class, slightly modified. Those children designated Lower Class
(LC) in this study comprised Category V of the Hollingshead-Redlich.
Those children designated Middle Class (MC) were of parents who were
categorized IV to II on that measure.




The original sample was representative of the larger Harlem population
of 1966. Of those ultimately classified as lower class (LC), 38 were
semi-skilled production workers and the remainder were unskilled,
unemployed, or on welfare. Twenty-three percent (23%) had completed high
school, and 34% had never attended high school. Of the MC parents, 15%
were in Hollingshead Category IV (clerical personnel, skilled laborers,
machine operators, etc.) and 23% were in Categories III, II, and I
(executives, administrators, business managers, etc.). Eighty-four percent
(84%) of MC had graduated from high school, and twelve percent (12%) of
that subset had graduated from college.

B. Subject Selection 

The 310 subjects in the original sample were selected from 1,500
birth records of black male children born in the Harlem and Sydenham
Hospitals in Manhattan between August and December of 1964. Each lived
between 100th and 145th Streets in Manhattan at that time. The information
on the 1,500 birth records was sent to the U.S. Post Office, which
confirmed 900 current addresses, from which 700 home interviews were
attempted and 500 were achieved. From that 500, the subject pool was
established with those children who met the following criteria: over five
pounds at birth, mother with no history of syphilis or drug addiction,
both parents self-described as Negro, both parents spoke English as a
first language, and no serious illnesses between birth and 24 months of
age. The 310 subjects were drawn from that pool to meet a predetermined
distribution by social class. That distribution specified more lower class
than middle class in each relevant cell because it was assumed attrition
would be greatest for the former.
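
The selection funnel described above (birth records, confirmed addresses, interviews attempted and achieved, subjects selected) can be summarized as step-by-step and overall retention rates. The following is a minimal sketch using only the counts reported in the text; the stage labels and the function name `retention` are illustrative assumptions, and the size of the eligible pool between the 500 interviews and the 310 subjects is not reported, so it is omitted.

```python
# Counts reported in the text for each stage of subject selection.
# Stage labels are illustrative; the eligible-pool size is not reported.
stages = [
    ("birth records", 1500),
    ("addresses confirmed", 900),
    ("interviews attempted", 700),
    ("interviews achieved", 500),
    ("subjects selected", 310),
]


def retention(stages):
    """Return (label, count, share of previous stage, share of first stage)."""
    first = stages[0][1]
    prev = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / prev, count / first))
        prev = count
    return rows


for label, count, step, overall in retention(stages):
    print(f"{label:22s} {count:5d}  step {step:6.1%}  overall {overall:6.1%}")
```

Read this way, roughly one in five of the original birth records ends up in the sample, with the largest single loss occurring at the address-confirmation step.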




Analysis of the selection process showed no bias across those assigned
to the several cells of the design at either age of training.

The percentages located and interviewed, and, among those interviewed,
the proportion who rejected the program, did not differ by group after
assignment was made.

Ultimate assignment to cells of the design was made on the basis of
an interaction between MC and LC, and on pre-training assessment using the
Concept Familiarity Index (CFI), a measure of already existing knowledge
about the same concepts included in the CT curriculum. CT, DT and Control,
by age of training, did not differ on means and standard deviations on the
CFI, nor on social class.

C. The Training Experience 

The T-2 group was trained in 1966, the T-3 group in 1967. The training
facility included eight 9' x f' rooms with one-way mirrors and audio
equipment which provided for observation. Parents and children were
provided transportation, if desired. Each child had regular appointments
every week for two one-hour sessions, staggered so that at least one day
intervened between them. Parents were encouraged to attend at least the
first six sessions, to be present while the child adapted to the Center,
and to have the opportunity to see and hear what activities their child
would be engaged in. After the first six sessions, parents could attend
when they desired, and if anything occurred in the training session they
wanted explained, they were free to enquire of instructors and other staff
personnel.

No parent observed CT training if their child was given DT training,
and vice versa. None were remunerated except in the form of transportation.




Over the two years of training, 82% of all appointments were met.
Each child that completed training had at least 40 sessions, and the average
child received h^. In addition, the average time required for pre-training
assessment was eight hours for T-2 and six for T-3. The average
post-training assessment time for training groups of both ages was six
hours. Of the 120 two-year-olds who began training in 1966 (T-2), 100
completed training eight months later. Of the 120 three-year-olds who
began in 1967, 114 completed the program.

D. Results of Preschool Assessments 

Participating children (T-2 and T-3) and the controls were assessed
when training ended and annually thereafter until they were four years
and eight months old. Consequently, those trained at two and the controls
were assessed at 2/8, 3/8 and 4/8; and those trained at three were assessed
at 3/8 and 4/8.

Although T-2, T-3 and the Controls were randomly assigned from the
subject pool, a difference of 5 points on the Stanford Binet IQ was found
between the controls at 2/8 and the T-3 group in post-training assessment
at age 3/8. For this reason, subsequent analyses were performed on the
data with covariate techniques, using IQ at first assessment as the
covariate.

Assessment at 2/8 showed that T-2 outperformed the Controls on 14
of 17 behavioral measures. No differences existed between CT and DT groups.

Assessment at 3/8, when T-3 had just completed training and T-2 had
completed one year before, showed that both groups were superior to the
controls on most measures in the 14-test battery administered. No
differences existed between CT and DT.

At age 4/8, most of the differences previously found between T-2
and T-3 and their controls disappeared. A statistically significant
difference remained for performance on the battery as a whole, across
measures, but only one individual measure differentiated those who had
been trained and the Controls. Again, no differences existed between CT
and DT groups.

With respect to performance as a function of the social class of
the subject, differences between LC and MC did not exist consistently at
2/8 and 3/8, but at 4/8 they were clearly established (Palmer, 1970).

The results of the early assessments indicated that early intervention
had immediate effects which were undifferentiated as far as age of
training and type of training were concerned, but that effects were barely
distinguishable at age 4/8. However, measures taken at 4/8 are not, and
cannot be, measures of scholastic achievement. One cannot measure
scholastic achievement until children attend school. Furthermore, measures
of scholastic achievement in the first two grades tend to be unreliable.
No reading scores are recorded in the NYC public schools in grade 1, and
those recorded for grade 2 have a built-in floor effect which makes the
value of the measure questionable. Beginning at grade 3, although the
floor effect persists, measures of reading are reliable indices of the
child's performance, and tend to become more reliable with age.

The question to be answered by this follow-up study of the Harlem
sample is: What are the effects of age of training, type of training
and social class on scholastic achievement? Scholastic achievement has
been defined as reading in the third, fourth and fifth grades, and by IQ.

III. SUBJECTS LOCATED AND ASSESSED IN 1975 
A. Subjects Found 




Two hundred ninety-two children of the original 310 completed training
in 1967 and 1968 and were assessed, or were assessed as controls in
1967. That figure (N=292) represents the maximum possible subject pool
for the follow-up study. One hundred thirty-nine (48%) of that pool have
been located and assessed on the Wechsler IQ as of August 22, 1976. One
hundred seventeen (117) reading scores for the fifth grade (1975) were
obtained by then. More IQ's than reading scores are available because a
subset of those administered the WISC were absent when the reading tests
were administered in the schools, or because the reading scores have not
yet been found. (Most New York City elementary schools were closed during
the summer of 1976.)

The number of children in each cell of the design at the time of the
first assessment, the number found for assessment in 1976, and the
percentage found of the maximum possible, are shown in Table 1.

For main effects, the following percentages of subjects were assessed:
T-2=54%, T-3=51%, Control=40%, CT=47%, DT=53%, LC=49%, and MC=52%.
Except for the controls, each cell is represented by about the same
percentage found, which is remarkable since no particular attempt was made
to search for subjects in any particular cell.

As of this writing, the study has been funded for another year of
follow-up. At the end of next year, it is reasonable to assume that the
total subjects found will approach 200, a figure better than the original
estimates of attrition made when the study was designed.

When the study was designed, the number of subjects per cell desired for
multivariate analysis was 15. More LC than MC, and more controls than
experimentals, were included in the original sample because
disproportionate attrition was anticipated among LC and controls. For the
controls, that estimate was well founded: of the original 68, 27 have been
found, three fewer than the design anticipated. LC, however, has remained
in the program more than expected as compared to MC (49% vs. 52%), leaving
the most serious cell deficiencies for multivariate analysis in
MC x T-2 x (CT, DT).

Despite the accuracy of the original estimates for attrition,
subsequent years of follow-up should locate every subject possible. Our
present estimate of subjects which will be located eventually, is 6^% - 
an increase of 50 subjects over initial projections. 

For the present analysis, however, some cell sizes are inadequate 
for multivariate analysis. Thus, the present analysis is limited to 
main effects, using Chi Square and t-ratio statistics. 

Are the subjects found representative of the original sample? Are 
they representative of Harlem fifth grade boys in 1975? 

We conclude that the subjects found are representative of the original
sample. Within the original LC sample, 44% of the Controls, 41% of
the T-2's, and 41% of the T-3's have been located and assessed. Within
the original MC sample, 53% of the Controls, 46% of the T-2's and 60% of
the T-3's have been located and assessed. Slightly more MC's than LC's,
and slightly more T-3's than T-2's have been assessed, but those
proportions are not sufficient to bias the analysis. Within the MC sample,
more of the group as a whole has been found, but the dropouts have been
disproportionately in Categories I, II, and III. For whatever reason, only
12 of an original 37 (32%) originally classified in those advantaged
groups have been located.

[Table 1. Number of children in each cell of the design at first assessment, number found for assessment in 1976, and percentage found of the maximum possible.]

The 139 subjects assessed in 1976 include 66 originally classified
in Category V (47%), 61 classified in Category IV (44%) and 12 in
Categories I, II, III (9%). That balance is believed to be representative
of the fifth grade boys in Harlem in 1975.
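
The social class composition of the located sample follows directly from the counts given in the text. As a minimal sketch, the percentages can be recomputed from those counts; the dictionary keys and variable names below are illustrative assumptions, while the counts (66, 61, 12, and 12 of an original 37) come from the report.

```python
# Counts of located subjects by original Hollingshead-Redlich category,
# as reported in the text. Names are illustrative.
found = {
    "Category V": 66,
    "Category IV": 61,
    "Categories I-III": 12,
}

total_found = sum(found.values())  # should equal the 139 subjects assessed

# Share of the located sample in each category, rounded to whole percents.
shares = {label: round(100 * n / total_found) for label, n in found.items()}

# Recovery rate for the most advantaged categories: 12 of an original 37.
advantaged_recovery = round(100 * 12 / 37)

for label, pct in shares.items():
    print(f"{label:18s} {pct:3d}%")
print(f"Categories I-III located: {advantaged_recovery}% of original 37")
```

The three counts sum to the 139 subjects assessed, which is a quick consistency check on the figures as transcribed.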

B. Attrition 

Are the subjects found for this analysis representative of the original
sample on Intelligence Scores and Social Class?

To determine whether or not the sample found is biased, those found
were compared with those not found on the last appropriate assessment with
the Wechsler Preschool Scale (WPPSI) and/or the Stanford Binet. The
results of that analysis show that the Controls found did not differ from
those not found, T-2 found did not differ from T-2 not found, T-3 found
did not differ from T-3 not found, CT found did not differ from CT not
found, and DT found did not differ from DT not found. We conclude that the
sample available for this analysis is not biased with respect to previous
scores on intelligence tests.

The same analysis by group was conducted for Social Class. Original 
scores on the Hollingshead-Redlich were compared with t-ratios between 
the sample found and not found. Appendix A shows no significant difference 
between those found and not found by group in the research design and by 
Social Class. We conclude that no bias exists with respect to the Social 
Class of those children found and not found. 

C. The Addition of a Comparison Group. 

Frequently, when longitudinal studies include a control group, the
controls as well as the treatment groups show an effect. This phenomenon
has been attributed to two factors: (a) the fact that the control group
is participating in a study, the Hawthorne effect, and (b) the fact that
the control group is assessed at the same intervals as the treated groups,
providing them with a degree of test wiseness not found in naive samples.

With respect to the Hawthorne effect, we have only anecdotal evidence.
Conversations with the Control mothers during the 1976 assessment
indicated that because their children had been assessed annually when they
were 2/8, 3/8, and 4/8, and again in 1976, some of those mothers considered
their children a part of the Harlem program as much as did mothers whose
children were trained.

With respect to test wiseness, the Control group had, on the average,
20 hours of experience with examiners and tests at ages 2/8, 3/8 and 4/8;
presumably twenty hours more than children not involved in the program.

For those reasons, it was considered essential to obtain another
Control group (a Comparison Group) who had no previous relationship with
the study. Ideally, that group would be drawn from the same population
as those who had participated in the program.

To date, this has been possible for reading scores only. To obtain
a comparison group for IQ, a selected sample would have to be identified,
located, and assessed on the WISC - too expensive for our budget.

For the results reported below, the Comparison Group has been derived 
in two ways, the former more precise than the latter. 

The Comparison Group for the fifth grade reading scores was obtained
in the following manner:

(1) The public schools (N=15) in Manhattan whose students were
90% or more Black were identified.

(2) The number of children in our sample, and assessed in 1976,
attending each of those 15 schools was tabulated.

(3) The percentage of children reading at grade level or better in
each of those schools was obtained.

(4) A weighted score was derived for each school by multiplying
the number of our subjects in that school by the school's percentage
reading at grade level, and a mean percentage reading at grade level or
above was calculated for the weighted distribution of school scores.

(5) It was ascertained that the weighted measure was identical
to the percentage in the school district which 50% of our sample attended
(District 5, Manhattan).

(6) The assumption was made that District 5, which as a District is
over 90% Black, was most representative of the children in our sample.

(7) A random sample of 5th grade reading scores was obtained from
different schools and classes in that district, 100 boys and 100 girls.
Girls read significantly better than boys.

(8) From 2,060 District 5 reading scores in the 5th grade (1975),
852 scores were obtained for boys only. Only those names for which there
was no question of gender were used.

(9) The resulting distribution of scores for 852 boys, 90% of whom
are presumably Black, was used in the analysis as that distribution most
representative of our sample. We refer to that group as the Comparison
Group for Grade 5 Reading results.
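
Step (4) of the procedure above describes a subject-weighted mean of school percentages. The following is a minimal sketch of that calculation; the function name and the three example schools are hypothetical, since the report does not list the per-school counts or percentages.

```python
def weighted_percent_at_grade_level(schools):
    """Subject-weighted mean of school percentages.

    `schools` is a list of (n_subjects_in_school, pct_at_grade_level) pairs.
    Each school's percentage is weighted by the number of study subjects
    attending it, then divided by the total number of subjects.
    """
    total_subjects = sum(n for n, _ in schools)
    if total_subjects == 0:
        raise ValueError("no subjects in any school")
    return sum(n * pct for n, pct in schools) / total_subjects


# Hypothetical data, for illustration only (not from the report):
# three schools with 10, 5 and 5 study subjects.
example_schools = [(10, 25.0), (5, 40.0), (5, 30.0)]
print(weighted_percent_at_grade_level(example_schools))
```

Weighting by the number of study subjects per school means the resulting figure describes the schools the sample actually attended, rather than giving every school equal influence.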

The Comparison Groups for grades three and four are not so precisely
derived at this time. The Comparison Group reported for those grades is
derived from the distributions of scores representing both boys and girls
in District 5 in 1975. Since the average boy is four months (5.0 vs. 5.4)
behind the reading level of the average girl in the fifth grade, we
assume that the Comparison Group distribution used at grades three and four
is a conservative estimate of the differences between our sample, all boys,
and the sample used as representative of their peers.

IV. READING RESULTS 
Reading results from the 1976 assessment include scores obtained
when the subject population was in the third grade (1973), the fourth grade
(1974) and the fifth grade (1975). They are presented below as a function
of age of training, type of training, and social class - the variables
manipulated in the research design - and by grade.

The results obtained include scores from over 50 elementary schools
in the New York City public school system and 15 schools in the Catholic
diocese in New York City, as well as nine scores for children who have moved
out of the City and now live elsewhere. Only scores directly comparable
to the measures used in the public school system were used in the analysis.

In 1975, the New York Public Schools began using the Stanford
Achievement Test (SAT) to assess the achievement of all pupils. In 1973
and 1974, and for several years previous to that, the Metropolitan
Achievement Test (MAT) was used. So far as can be determined, the reasons
for shifting measures were two: (1) the norms for the SAT were more
representative of the children in the NYC public schools; that is, on the
average their students would score higher on the SAT than the MAT, and
(2) it was suspected that coaching for the MAT was rampant - the MAT had
been used long enough so that items on that test were available. For the
second of these reasons, the individual scores obtained for grade 5 are
probably more valid than scores at grades four and three. Since the SAT
was used for the first time in 1975, teachers had little opportunity to
coach for it.

The shift from the MAT to the SAT explains the spurt in reading
scores from grade four to five, as compared to the change from three to
four. For Comparison Groups comprised of both boys and girls at each
grade level in District Five, the average gain from the third grade to the
fourth was only 4.4 months, whereas from grade four to grade five, the
average gain was 17 months. Either fifth grade teachers are performing
miracles or the SAT and MAT norms differ.

There are some 15 scores in the distribution obtained from those
Catholic schools where the Stanford Reading Achievement Test (SRA) was used.
Where students in the Catholic schools, private schools or out of New York
schools were assessed on reading measures other than the MAT, SAT, or SRA,
those scores are not included in the analysis. Only four or five such
scores were obtained. SRA scores at the fifth grade level were considered
equal to SAT scores after discussions with the respective research
departments of the publishers concerned. The 15 SRA scores included with
the SAT scores from the public school system are proportionately
distributed between the several cells of the research design.

Two indices of reading achievement are reported in the results:
the percentage of children reading at grade level or better, and the
average reading score. Analyses related to the former are by Chi Square;
those related to the latter by a two-tailed t-ratio. The original
hypotheses of the study predicted the CT group would outperform the DT
group, and the MC group would outperform the LC group, and that both would
outperform the controls. The two-tailed test of significance was also used
in the analysis for age of training, for which no original hypothesis
was made with respect to direction. This is recognized as a flaw in the
analysis as it exists at this time.
An average reading score of 5.1 indicates that the group reading
achievement is five years and one month, as compared to the national norm
of, in the case of the SAT fifth grade results, 5.7. Only the ten months
of the school year are reflected in the scores; that is, the next higher
score to 5.9 is 6.0, not 5.10.
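
Because grade-equivalent scores count ten school months per year, differences between scores convert to months by a factor of ten. The following is a minimal sketch of that convention; the function names are illustrative assumptions, and `Decimal` is used only to avoid floating-point rounding in the example.

```python
from decimal import Decimal


def grade_equivalent_to_months(score: str) -> Decimal:
    """Convert a grade-equivalent score such as "5.1" to school months.

    Under the ten-month convention, 5.1 is five years and one month,
    i.e. 51 school months.
    """
    return Decimal(score) * 10


def months_to_grade_equivalent(months) -> Decimal:
    """Convert school months back to a grade-equivalent score."""
    return Decimal(months) / 10


# Gap between a group average of 5.1 and the SAT fifth grade norm of 5.7.
gap_to_norm = grade_equivalent_to_months("5.7") - grade_equivalent_to_months("5.1")

# One month after 5.9 is 6.0, not 5.10.
next_score = months_to_grade_equivalent(grade_equivalent_to_months("5.9") + 1)

print(gap_to_norm, next_score)
```

The same conversion applies to any pair of average scores reported in the tables, so a difference of 0.30 in average reading score corresponds to three school months.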

Chi square and t-ratio analyses are detailed in Appendix B. Percent
above average or better refers to the percent reading above the national
average on the SAT or MAT.

A. Reading as a Function of Age of Training 

Table II shows the reading level of the subjects who participated
in the Harlem study, and of their Comparison Group, during the years 1975,
1974 and 1973, when the modal subject in the sample was in grades five,
four and three, respectively. For each year, the table gives the percent
reading at grade level or above and the average reading score.

1. Reading and Age of Training: Grade 5 (1975)

a) Percent at grade level or better. Forty-eight percent of
T-2, 40% of T-3, and 31% of the Comparison Group were reading at grade
level or better in April, 1975, when the measure was given; the figure
for the Controls is given in Table II. Chi Square analysis revealed that
the proportion of children in T-2 reading at grade level or better was
significantly higher than that proportion of the Comparison Group
(p < .03). Other possible comparisons did not reach the .05 level of
statistical significance. The T-2 Group at 48% is only slightly below the
national norm for the SAT (50%).
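
The T-2 versus Comparison Group test can be illustrated with a 2 x 2 Chi Square on counts back-calculated from the rounded percentages. This is a sketch under stated assumptions, not the author's computation: the counts 19 of 40 (48%) and 264 of 852 (31%) are reconstructed from the reported percentages and group sizes, no continuity correction is applied (the report does not say whether one was used), and the function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Assumed counts, back-calculated from rounded percentages in the report:
# about 48% of 40 T-2 boys and about 31% of 852 Comparison boys.
t2_at_level, t2_n = 19, 40
comp_at_level, comp_n = 264, 852

chi2 = chi_square_2x2(
    t2_at_level, t2_n - t2_at_level,
    comp_at_level, comp_n - comp_at_level,
)
print(round(chi2, 2), round(p_value_df1(chi2), 3))
```

Under these assumed counts the statistic exceeds the one-degree-of-freedom critical value at the .05 level, which is in line with the significance level the report gives for this comparison.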

b) Average Reading Score: The average reading scores were
T-2=5.35, T-3=5.02, Control=5.09 and Comparison=5.09. None of the t-ratios
between groups achieved statistical significance, although the T-2 vs.
Comparison Group approached it. (See Appendix B.)

2. Reading and Age of Training: Grade 4 (1974)

a) Percent at Grade Level or better. Thirty percent (30%)
of T-2, 26% of T-3, 23% of the Controls, and 22% of the Comparison Group
read at grade level or better. Chi Square revealed no significant
differences, although T-2 vs. Comparison approached the .05 level of
confidence (Chi Square = 3.73, df = 1; critical value = 3.80).

b) Average reading score: Average reading scores were T-2=4.06,
T-3=3.79, Controls=3.70, and Comparison=3.79 (both boys and girls). No
t-ratio between pairs was statistically significant.

3. Reading and Age of Training: Grade 3 (1973)

a) Percent at Grade Level or better: Forty-seven percent (47%)
of T-2, 40% of T-3, 26% of Controls, and 29% of Comparison were reading at
grade level or better. Chi Square revealed T-2 to be significantly higher
than Comparisons. No other pairings reached a satisfactory level of
confidence but several approached that level.

b) Average Reading Score: Average reading scores were T-2=3.60,
T-3=3.40, Control=3.39, and Comparison=3.35. None of the differences were
statistically significant.

Discussion: Reading and Age of Training 

The T-2 Group had the highest average reading score in grades 



19 



Palmer 



Page 3.7 



TABLE II 

AGE OF TRAINING AND READING AT AGES 9-11 

12 3k 

Trained At Trained At Control Comparison 

2k Months 36 Months Group Group 

ISIll 5th Grade (SAT) (N = ko) (N =53) = 28) (N = 852) 

a) % Grade Level or k8% ho% 26% 30% 
Better (5-7 Years) 



b) Average Reading 5.35 5.02 5. 09 

Score 



5.01 



197^: ^bh Grade (MA T) (N = kO) (N = U5) (N = 30) (N = 2096) 

a) % Grade Level or 30 25 23 22 
Better 

b) Average Reading U.06 3,79 3,70 3.79 
Score 

1973: 3rd Grade (MAT) (N = 36) (N = k3) (N = 19) (N = 2l6o) 
a.) % Grade Level or 

Better U7 Uo 26 29 

b) Average Reading 3.6O 3.U0 3,39 3.35 

Score 



20 



Palmer Page 10 

3, 4, and 5. The probability of that occurring by chance is .0156. At
grade three T-2 was 2.5 months ahead of Comparison and 2 months ahead of
T-3 and Control. At grade 4, it was 3 months ahead of Comparison and Control
and 1.5 months ahead of T-3. At grade 5 it was 3.5 months ahead of
T-3, Comparison and Control.
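The chance figure of .0156 is consistent with a simple counting argument, sketched below under assumptions the report does not state explicitly: each of the four groups is taken to be equally likely to rank first at a given grade, the three grades are treated as independent, and ties are ignored. The function name is hypothetical.

```python
from fractions import Fraction


def p_named_group_first_every_grade(n_groups, n_grades):
    """Probability that one named group ranks first at every grade,
    assuming equally likely winners and independent grades."""
    return Fraction(1, n_groups) ** n_grades


p = p_named_group_first_every_grade(n_groups=4, n_grades=3)
print(p, float(p))  # 1/64 0.015625
```

Rounded to four places this gives .0156, the value quoted in the text.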

At each grade a higher proportion of T-2 reads at grade level or
better. The difference between T-2 and Comparison is statistically
significant at Grades 3 and 5 and misses significance at grade 4 by .07 from
the critical value of 3.80.

While T-3 shows a higher proportion reading at grade level than Control 
or Comparison at each grade level, their average scores are almost identical 
to those two groups at grades 3 and 5.

B. Reading and Type of Training

Table III shows reading in the three grades as a function of whether 
subjects received CT or DT training. The Control and Comparison Group 
scores shown there are, of course, the same as shown in Table II. 

1. Reading and Type of Training: Grade 5 (1975)

a) Percent at grade level or above: Forty seven percent (47%) of CT,
37% of DT, 36% of Controls, and 30% of Comparisons read at grade level
or above. CT was significantly better than Comparison (p < .02). No
other comparison between groups was significant.

b) Average reading score: The average reading score for CT
was 5.29, DT was 4.97, Control was 5.09, and Comparison was 5.01. None of the
t-ratios was significant at the .05 level of confidence.

2. Reading and Type of Training: Grade 4 (1974)

a) Percent at national average or better: Twenty nine percent (29%)






TABLE III

TYPE OF TRAINING AND READING AT AGES 9-11

                           Concept     Discovery     Control    Comparison

1975: 5th Grade (SAT)      (N = 55)     (N = 36)     (N = 28)    (N = 852)

a) % Grade Level or           47           37           36          30
   Better

b) Average Reading            5.29         4.97         5.09        5.01
   Score

1974: 4th Grade (MAT)      (N = 47)     (N = 37)     (N = 28)    (N = 2096)

a) % Grade Level or           29           21           23          22
   Better

b) Average Reading            3.95         3.86         3.70        3.79
   Score

1973: 3rd Grade (MAT)      (N = 45)     (N = 29)     (N = 19)    (N = 2160)

a) % Grade Level or           43           43           26          29
   Better

b) Average Reading            3.60         3.40         3.39        3.35
   Score




of CT, 21% of DT, 23% of Control, and 22% of Comparison were reading at the
national average for grade 4 or better. Chi Square analysis shows the
proportion of CT's reading to that criterion is significantly higher than
Comparison. No other difference between groups was statistically significant.

b) Average reading score: The average reading score for CT
was 3.95; for DT it was 3.86, for Controls 3.70, and for Comparison 3.79.
No comparisons were statistically significant.

3. Reading and Type of Training: Grade 3 (1973)

a) Percent at grade level or better: Forty three percent (43%)
of CT, 43% of DT, 26% of Controls and 29% of Comparisons were reading at
grade level or better. The difference between CT and Comparison was
statistically significant at the .05 level. Other differences
were not, but some approached that criterion (See Appendix B).

b) Average Reading Scores: CT (3.60) was 2.5 months ahead
of Comparison (3.35), and two months ahead of DT (3.40) and Controls (3.39).
No difference between groups was statistically significant.
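The "months ahead" figures follow from reading the scores as grade equivalents. A minimal sketch, assuming the usual convention that one tenth of a grade-equivalent unit corresponds to one month of a ten-month school year (the report uses this reading implicitly but does not spell it out; the helper name is hypothetical):

```python
MONTHS_PER_GRADE = 10  # assumption: ten school months per grade-equivalent unit


def months_ahead(score, reference):
    """Difference between two grade-equivalent scores, in school months."""
    return round((score - reference) * MONTHS_PER_GRADE, 1)


# Grade 3 averages quoted in the text.
print(months_ahead(3.60, 3.35))  # CT vs. Comparison -> 2.5
print(months_ahead(3.60, 3.40))  # CT vs. DT         -> 2.0
print(months_ahead(3.60, 3.39))  # CT vs. Controls   -> 2.1
```

The text rounds the last two differences to "two months".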

Discussion: Reading and Type of Training

The CT Group had the highest percent reading above grade level in
Grade 5 and was equal to DT and higher than Control and Comparison in
Grades 3 and 4. The probability of that occurring is .031. At grade
three, the average member of the CT Group was 3.5 months ahead of Comparison,
1 month ahead of Control, and equal to DT. At grade 4, CT was 5 months
ahead of Comparison, 2.5 months ahead of Control, and one month ahead of
DT. At grade 5, CT was 3 months ahead of Comparison, 2 months ahead of
Control, and 3 months ahead of DT.




The DT Group, better than Control or Comparison at Grade 3, and still
better than Control and Comparison at grade 4 (though less so than at grade
three), is no better at grade 5 than either Control or Comparison on average
reading score or percent reading at grade level or above.

C. Reading by Social Class

Comparisons by Social Class across treatment groups are shown in
Tables IV and V. Table IV shows percent at grade level and average reading
score, and Table V details the relationship between Social Class and treatment
(CT, DT, C) and percent at grade level or better.

1. Reading and Social Class: Grade 5 (1975) 

a) Percent at grade level or better: Chi Square 4 x 2 analysis
reveals that the four conditions have significantly different proportions of
subjects reading at grade level or better (p = .025). Analysis of dyads
shows the proportion in the MC to be significantly different from the
Comparison Group (p = .025). No other dyad comparisons were statistically
significant.

The MC x CT x T-2 cell and the MC x CT x T-3 cell have the highest
percentage reading at grade level or better of all possible cells, and the
LC x DT x T-3 cell is the lowest. The LC x DT x T-2 group has 43% at that
criterion, only slightly below the national norm.

b) Average reading score: The t-ratios (Appendix B) show that
MC has significantly higher reading scores than LC (p = .05) and
the Comparison Group (t = 2.24, p = .05). No other comparisons were
significantly different.

2. Reading and Social Class: Grade 4 (1974)

a) Percent at Grade Level or better: A Chi Square 4 x 2 analysis


TABLE IV

SOCIAL CLASS AND READING AT AGES 9-11

                             Lower        Middle
                             Class        Class       Control    Comparison

1975: 5th Grade (SAT)      (N = 63)     (N = 58)     (N = 27)    (N = 852)

a) % Grade Level or           35           48           36          30
   Better

b) Average Reading            4.80         5.53         5.09        5.01
   Score

1974: 4th Grade (MAT)      (N = 59)     (N = 60)     (N = 20)

a) % Grade Level or           20           37           25      [illegible]
   Better

b) Average Reading            3.62         4.10         3.70        3.79
   Score

1973: 3rd Grade (MAT)      (N = 41)     (N = 38)     (N = 19)

a) % Grade Level or       [illegible]  [illegible]      26      [illegible]
   Better

b) Average Reading            3.49         3.50         3.39        3.35
   Score






TABLE V

PERCENT READING AT GRADE LEVEL, GRADE 5:
AGE LEVEL X TYPE OF TRAINING X SOCIAL CLASS

                     Concept Training    Discovery     Control

A. Trained at Two:

   Lower Class             43%              30%          33%
                         (N = 11)         (N = 8)      (N = 12)

   Middle Class            42               67           40
                         (N = 12)         (N = 6)      (N = 15)

B. Trained at Three:

   Lower Class             36               19           33
                         (N = 14)         (N = 16)

   Middle Class            63               50           40
                         (N = 14)         (N = 6)

C. A and B

   Lower Class             40               29           33
                         (N = 25)         (N = 24)

   Middle Class            54               57           40
                         (N = 26)         (N = 12)




reveals that the proportions at grade level or better for the four conditions
differ significantly (p < .05). No differences existed between dyads.

b) Average reading score: Appendix B shows that while no dyad
comparisons yielded statistically significant differences, several approached
the critical value. MC had the highest average reading score, 4.11, 7
months ahead of the Comparison Group, 4 months ahead of Control, and 3.5
months ahead of LC. LC bested the Comparison Group by two months.

3. Reading and Social Class: Grade 3 (1973)

a) Percent above grade level: The proportions reading above
grade level did not differ significantly across or within the four groups
in the analysis. The LC (40%) was equal to the MC (40%) in subjects
reaching the grade level criterion.

b) Average reading score: No differences exist by Social Class 
for the average reading scores in the groups analyzed. 

4. Discussion: Reading and Social Class

The LC sample appears to have benefitted from their involvement in
the study despite the lack of statistically significant comparisons. Their
average reading score is 4 months better than the Comparison Group in the
third grade (comprised of boys and girls, both MC and LC), two months
ahead of the average for a similar Comparison Group in the fourth grade,
and lags behind the Comparison Group in the 5th grade - but they have a higher
percentage reading at grade level or better. They are 2 months behind the
Control Group in Grade 5 (both MC and LC) but are equal to that Group in
percent reading at grade level. At grade 4 they are only slightly below
the Controls for both indices; and at Grade 3, they lead on both indices.




Not surprisingly, the superiority of the MC group appears to
increase with age.

D. Discussion: Reading 

Because of the lack of statistical significance of some comparisons
between T-2 and CT on the one hand and Control and Comparison on the other 
we are not absolutely certain that those two conditions have significantly 
affected the reading scores of the children who comprise them. More fifth 
grade scores will be obtained in the next follow-up (1977) and with the 
power of additional subjects, perhaps those questions will be resolved. 

All of the data, however, point to three conclusions about reading:
(1) Training at age two is more effective than training at age three or
no training at all. The T-2 Group is reading at grade level only slightly
below the national norms (48% vs. 30%). Its average reading score for
grade 5 is 3.5 months below that norm of 5.7, but is 3 months ahead of
Controls, Comparisons, and even T-3. The probability that chance would
explain their superiority at each grade level is 15 in 1,000. At each
grade level, T-2 exceeds all other groups in percent reading at grade
level. These data cannot speak to why intervention at age two is more
effective than intervention at age three for reading level in grade 5.

(2) Concept Training is more effective than no training or discovery
training. The CT group is reading at grade 5 only slightly below the
national norms. Its average reading score is the highest at every grade.
At grade 5, the average reading score is 5.3, which looks poor compared
to the national norm, but good when compared to the Control (5.1) and
Comparison Groups (5.0). The boys in that Group have almost
caught up with the average for their female peers (5.3 vs. 5.4).


Why those in the DT Group have lost ground with respect to reading,
when at 4/8 they were equal to the CT group on almost every measure, is
fascinating to consider - but these data contribute little to why.

(3) The Lower Class children in the training groups seem to be holding
their own when compared to their Control peers, to include MC and girls.
This statement is perhaps more speculative than (1) and (2) above. But
their percent reading at grade level and average scores compare favorably
with Controls and Comparisons at every level except the fifth grade average
score, and even there, they are equal to Controls and better than
Comparisons on percent reading at grade level.

Beyond those conclusions, there are some significant aspects to the 

data. 

The small (N = 28) sample of Control subjects found appears to be
representative of the Comparison Group (N = 852). Percent reading at grade
5 level (36% vs. 30%) and average reading score (5.09 to 5.01) are only
slightly higher in the Controls than in the Comparisons. Considering
the Hawthorne effect and their experience in previous assessments, one
might have predicted a greater difference. But perhaps Hawthorne effects
and experience with psychological tests are unrelated to whatever reading
is. In any event, the Controls appear representative of the Harlem
population - so that generalizations from differences found between T-2 and
CT and Comparisons to differences between those groups and the Controls
may not be as presumptuous as they might first appear.

Finally, a note about the distributions of reading scores - those in 
the study are highly variable. 


But the standard deviation of 5th grade reading scores in our sample is
highly related to the size of the N in each condition:

GROUP            N       SD

Comparison      852     1.51
T-3              53     1.61
LC               63     [illegible]
CT               55     1.78
T-2              40     1.78
DT               38     1.83
MC               58     1.92
Control          28     2.06



for a Spearman rank order correlation of -.75. Thus, we may expect that as
the N is increased in the sample in subsequent follow-ups, the SD of each
group will diminish. When the mean differences in
the t-ratio remain the same, the N gets larger, and the SD gets smaller,
the ratio gets larger and is more likely to be at a satisfactory level of
significance. Thus, if the mean differences remain the same, we may
expect more statistically significant differences between groups with
subsequent reading scores. Furthermore, as the sample grows with subsequent
follow-ups, cells of the original design (e.g. LC x T-2 x CT) will become
adequate in size for multivariate analysis, which offers more powerful
statistics for determining the main effects of this study as well as
interactions within those effects.
Hopefully, we shall be able to make such statements as "for lower class
children to read better the data indicate that condition (T-2) + (CT) is




best, but for MC children either (T-2) + (CT) or (T-3) + (CT) works equally
well."
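The rank order correlation between N and SD can be recomputed from the listed values. The sketch below is a hand-rolled Spearman coefficient (Pearson correlation of average ranks, so ties are handled). Two assumptions are flagged: the LC row is left out because its SD is not legible in this copy, and the row with N = 53 is taken to be T-3. With a row missing, the result need not equal the coefficient quoted in the text; what matters is the sign, since a negative value means larger groups show smaller SDs.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# (group, N, SD) as listed; LC omitted (SD illegible), "T-3" label assumed.
rows = [
    ("Comparison", 852, 1.51), ("T-3", 53, 1.61), ("CT", 55, 1.78),
    ("T-2", 40, 1.78), ("DT", 38, 1.83), ("MC", 58, 1.92),
    ("Control", 28, 2.06),
]
rho = spearman_rho([n for _, n, _ in rows], [sd for _, _, sd in rows])
print(f"rho = {rho:.2f}")
```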

IV. IQ RESULTS 

The intelligence test (IQ) scores reported below are derived
from the Wechsler Intelligence Scale for Children (WISC). They are
combined scores taken from the administration of two revisions of that measure,
the WISC (1949) and the WISC-R (1974).

Ninety (90) children of the original sample were administered the
WISC in 1974, when they were ten years of age. Forty-nine (49) were
administered the WISC-R in 1976, when they were twelve years of age.

The scores have been combined using corrections from the WISC-R
to the old WISC. Those corrections were obtained from the research
section of the Psychological Corporation. They are: for the Full
Scale WISC IQ, 2.5 points; for the Verbal IQ, zero points; and for the
Performance IQ, 4.6 points. The corrections are made by adding 2.5 or 4.6
to the Full and Performance IQ's obtained in the 1976 assessment. The norms
with which the scores listed below are correctly compared are the norms
on the old WISC.
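The correction just described amounts to adding a fixed constant to each WISC-R score before pooling it with the WISC scores. A minimal sketch follows; the constants are those stated in the text, while the function name, the scale labels, and the example score are hypothetical.

```python
# Points added to a WISC-R score to express it on old WISC norms (from the text).
WISC_R_TO_WISC = {"full": 2.5, "verbal": 0.0, "performance": 4.6}


def to_old_wisc_norms(score, scale, test):
    """Return the score on old WISC norms; WISC scores pass through unchanged."""
    if test == "WISC":
        return score
    if test == "WISC-R":
        return score + WISC_R_TO_WISC[scale]
    raise ValueError(f"unknown test: {test!r}")


# A hypothetical 1976 WISC-R Performance IQ of 95 is pooled as 99.6.
print(to_old_wisc_norms(95.0, "performance", "WISC-R"))
```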

One reason the old WISC was revised is that the mean IQ under the norms
originally derived, 100, had changed to about 105. The revision presumably
means the average score on the WISC-R will more closely approximate 100.
Our transformation to old WISC scores was made to conform more nearly to
the norm that the educational and psychological communities, as well as the
public, use implicitly when hearing average IQ scores.




Poor, Black children in the public schools have characteristically
averaged around 91 on the old WISC norms, although that average changes
downward with age. Thus, we may expect that the average score for samples
of such children will be lower on the WISC-R. Concerned investigators
and the public at large should be informed of this change. The norms
have been changed. The children are not dumber.

Why did we choose the WISC-R to administer in 1976 when we had used the
WISC in 1974? There are two answers. (1) In 1975, a consortium of
investigators of early childhood intervention agreed to adopt some common
measures, one of which is the WISC-R. Those common measures will provide
the consortium, the supervisory center for which is at Cornell University
under the direction of Professor Irving Lazar, with the ability to compare
results across studies. The Harlem study is a member of that consortium.
(2) The second reason for adopting the WISC-R is that funds are available for
reassessing the Harlem sample found and locating additional subjects.
In 1976-77 the 90 subjects who comprised the 1974 assessment will be
evaluated again, with the WISC-R. At that time, the entire sample will have
scores obtained from the same measure, and corrections will not be necessary.

The 90 subjects administered the WISC in 1974 were assessed at the
New York Medical College by Dr. Miriam John and her staff. Neither she nor
her staff knew which children belonged to the several cells in the research
design, or which were treatment children and which were controls.

The 49 subjects administered the WISC-R in 1976 were assessed at
what was once the Harlem Training Center, now the CUNY Center for Community
Research and Services. The examiners (2) were in the employ of the writer,
but had no knowledge of what subjects were in what cell of the design -




or, for that matter, about the design itself. 

The WISC data are presented without a Comparison Group because the
WISC-R is relatively new, and adequate norms on a national sample are not
yet available, particularly for Black boys. Presumably, subsequent
follow-up studies will have norms available which will provide broader
comparisons.

The IQ data are presented as average scores related to age of
training, type of training and social class. Statistical analyses are
limited to the t-ratio as described in the analysis of the reading scores.

Statistical analyses (t-ratio) for groups compared by average IQ 
score are in Appendix P, 

A. IQ and Age of Training

Table VI presents Full Scale, Verbal and Performance IQ's from the
WISC. The former is derived by combining the scores obtained on verbal
and performance.

1. Full Scale IQ: No significant difference exists between the
average IQ of T-2 (x = 99.40) and T-3 (x = 99.36). The difference between
T-3 and Control (x = 99.36 vs. 93.16) is statistically significant at the
p < .05 level (t = 2.00, df = 84). No statistical difference exists at a
satisfactory level of confidence (.05) between T-2 and Control, but that
difference approaches that value (p < .10; t = 1.80, df = 78; critical value = 2).
(See A.3, Performance.)

2. Verbal IQ: No significant differences exist between T-2 (100.32),
T-3 (99.97) and Control (96.26).

3. Performance IQ: Both T-2 (t = 2.15, df = 80) and T-3 (t = 2.78, df = 86)
are significantly better than the Controls. No difference exists between






Table VI
IQ and Age of Training

                  Trained at     Trained at      Control
                   24 mos.        36 mos.         group
                   (N = 53)       (N = 59)       (N = 27)

WISC-full
   X                99.40          99.36          93.16
   SD               13.79          11.78          16.27

Verbal
   X               100.32          99.97          96.26
   SD               15.78          13.99          17.89

Performance
   X                97.92          98.82          90.76
   SD               14.00          11.68          14.10




T-2 and T-3. Since no differences were found on the Verbal Scale between
T-2, T-3, and Control, we conclude that the Full Scale IQ differences
found (A.1) are largely a function of those items of the WISC which
contribute to the Performance Score.

B. IQ and Type of Training 

Table VII presents means and standard deviations for IQ and type of
training.

1. Full Scale IQ: No difference exists between CT (100.16) and DT (98.34).
The difference between CT (100.16) and Control (93.16) is significant
at the p < .025 level of confidence (t = 2.24, df = 89). The difference
between DT and Control is not statistically significant (t = 1.50, df = 73).

2. Verbal: No significant differences exist between CT (100), DT
(100) and Control (96) on the verbal measure despite the four point score
difference.

3. Performance: CT was significantly different from Control (p < .005).
DT was significantly different from Control (p < .05). No difference existed
between CT and DT. Both types of training influenced the Performance IQ
significantly.

C. IQ and Social Class 

Table VIII presents IQ data as a function of Social Class. 

1. Full Scale: MC differs significantly from LC (t = 2.47, df = 84);
MC differs significantly from Controls (t = 2.74, df = 73).

2. Verbal: MC (102) is significantly higher than LC (98), p < .05.
Neither LC nor MC differs significantly from the Controls.






Table VII
IQ and Type of Training

                   Concept       Discovery       Control
                   Training      Training         Group
                   (N = 64)      (N = 48)        (N = 27)

Full Scale
   X               100.16          98.34          93.16
   SD               12.34          13.25          16.27

Verbal
   X               100.36          99.83          96.26
   SD               14.40          15.38          17.89

Performance
   X                99.38          97.07          90.76
   SD               12.22          13.51          14.10






Table VIII
IQ and Social Class

                      Participating
                    Lower         Middle         Control
                   (N = 53)      (N = 59)        (N = 27)

Full Scale
   X                96.31         102.14          93.16
   SD               11.73          13.02          16.27

Verbal
   X                97.62         102.39          96.26
   SD               13.99          15.19          17.89

Performance
   X                95.79         101.08          90.76
   SD               12.18          12.48          14.10




3. Performance: MC (101) is significantly higher than LC (96),
p < .025. MC is significantly higher than Controls (91), p < .005. LC
(96) approaches significance at the .05 level when compared to the Controls
(t = 1.66).

IQ: DISCUSSION 

Both ages of training (T-2 or T-3) affected full IQ significantly.
When compared to the Control Group, Concept Training but not Discovery Training
affected IQ significantly as well. But the data are clear with respect
to the subscale of the WISC which contributed to that IQ difference.
There were no significant differences between groups for age or type of
training on the Verbal Scale, but the differences on the Performance
Scale are consistent and impressive. The respective IQ's reflecting that
scale were T-2 = 98, T-3 = 99, CT = 99 and DT = 97, all of which are significantly
higher than the Control Performance IQ of 91. The domain of behavior
which the items on the Performance Scale measure is the domain influenced
by our early intervention.

There were no significant differences at age 4/8 in favor of type
or age of training as compared to Control. The fact that the sample size
was almost twice as large at 4/8 makes the argument for a "sleeper effect"
persuasive - on the same test (WISC) given at 4/8 and at ages 10 and 12,
no differences occurred at 4/8 but were found later.

While the IQ evidence shows no differences by age of training at
this time, there is still some suggestion that T-2 is more effective
than T-3. The T-2 Stanford Binet IQ average at 2/8, after training, was




93. A year later, at 3/8, it was 97. The T-3 average at 3/0, before
training, was 96. At 3/8, after training, it was 99. Thus, T-2 changed
more. T-3 began with higher IQ's, but T-2 is higher now - but not
significantly so.

The original IQ advantage at 3/0 for T-2 is also relevant in
interpreting results related to type of training. The T-2 group was comprised of
half DT's and half CT's, as was the T-3 Group. Thus, even if T-3 had
continued to exceed T-2 at age 12, the early IQ advantages of T-3 would not
influence the type of treatment results.

Both types of training influenced IQ. The major influence was on
whatever it is that the Performance Scale measures, rather than the Verbal
Scale. There are nine IQ points difference on the Performance Scale
between Concept Training and Control!

V. General Discussion and Conclusions 

The data are persuasive that at least one form of intervention,
Concept Training, at age 24 or 36 months significantly affects reading in
the fifth grade and IQ at age 10-12. Intervention at age two had an effect
on reading and IQ - whereas intervention at age three affected IQ, but not
reading. It is also clear that Discovery Training affected IQ, but did not
affect reading.

The evidence on the IQ is conclusive and illuminating. The IQ's
were affected because of what the Performance Scale measures, which is
different from what the Verbal Scale measures. It cannot be said that the
data are statistically significant "but not really significant". Nine IQ
points on the Performance Scale, 7 IQ points on the Full Scale for the




64 CT children are significant, in this case, regardless of the connotation
of the word significant, and that for an intervention of two hours weekly for
eight months, ten years ago.

The evidence for the effect on reading is conclusive only if one is 
persuaded by two arguments: (l) that the Comparison Group is a valid 
Control Group - equal in Social Class and intellective potential to the 
Controls involved in the study and without the possible advantages of 
prior testing and the Hawthorne effect and (2) that the combined effect 
of many t-ratios at or approaching statistical significance for the T-2 
and CT groups and the probability of those groups leading all other groups 
in average reading score at all three grade levels (p = .015) provide a 
level of statistical confidence which no single t-ratio does provide. 

The Comparison Group includes all boys in the fifth grade in District
5 (Harlem) whose names on class rosters left no doubt as to gender. The
original Control Group was comprised of 35% Category V; 34% Category IV;
7% Category III; and [illegible] Categories II and I on the
Hollingshead-Redlich. Harlem, despite its reputation, is not all ghetto;
many Black middle class families attend school there. Presently, the
unemployment rate in Harlem is close to 30%. A larger proportion of our
original sample were unemployed when the study began. The attrition analysis
shows that the




sample found and assessed in 1976 did not differ by Social Class from those
in the original sample who were not found and assessed (Appendix A, II). If
anything, it appears that the sample in this study was somewhat lower in
SES than the Harlem population of fifth grade boys of which it is a subset.

The cumulative effect of repeated measures, each of which is just
short of the .05 level of confidence, is best demonstrated in Appendix B,
I, B, 2a. The 4 x 2 Chi Square is 7.08, 6.26, and 6.55 for grades 5, 4,
and 3, respectively. The critical value for .05 with three degrees of
freedom is 7.8. All three are at the p < .10 level of confidence. It is argued
that those proportions do differ across groups at a satisfactory level of
statistical confidence. If the Concept Training Group had not had the
highest average score at each grade, we might conclude that the distributions
differ but be uncertain as to what group read best. But the chance that
CT would read best at every age, given four possible winners, is .015.



The t-ratios between T-2 and CT on the one hand and Control and
Comparison Groups on the other are consistently in favor of the former
groups on every comparison. For those to reach a satisfactory level of
confidence, three events must occur as additional subjects are found and
assessed: (1) the sample size must increase, (2) the standard deviations
must decrease, and (3) the differences between means must remain the same
or increase.
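How the three conditions act on the t-ratio can be seen from its formula. The sketch below assumes a pooled-variance t-ratio for two independent groups; the report does not say which form was used, although the degrees of freedom it quotes elsewhere look like N1 + N2 - 2. The function name is hypothetical, and the "later follow-up" figures are invented purely to show the direction of the effect.

```python
import math


def pooled_t_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t-ratio and its degrees of freedom (n1 + n2 - 2)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Grade 5 values from the text: T-2 (5.35, SD 1.78, N 40) vs. Control (5.09, SD 2.06, N 28).
t_now, df_now = pooled_t_ratio(5.35, 1.78, 40, 5.09, 2.06, 28)

# Hypothetical follow-up: same mean difference, twice the N, somewhat smaller SDs.
t_later, df_later = pooled_t_ratio(5.35, 1.60, 80, 5.09, 1.80, 56)

print(f"now:   t = {t_now:.2f}, df = {df_now}")
print(f"later: t = {t_later:.2f}, df = {df_later}")
```

With the mean difference held fixed, the larger N and smaller SD both shrink the standard error, so the ratio grows; if the mean difference shrinks instead, the ratio can fall despite the added subjects.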

Assessing additional subjects is synonymous with the sample size
increasing. The SD will almost certainly decrease as the N increases: (a)
within groups the rho between sample sizes and SD's is -.75; (b) the
SD for the 852 boy Comparison Group is 1.51, for the Controls it is 2.02;
CT is 1.78; and T-2 is 2.02. The sample in the study is a subset of the




Comparison Group. The only possibility which can prevent increased
statistical confidence with additional subjects is that the difference between
means decreases - that, of course, is what the t-ratio states. Presently,
our distributions are so variable that the mean for each group may change
with additional subjects, in either direction.

If one is not persuaded by these arguments that the combined effects
of the reading analysis are conclusive about the superiority of the T-2
and CT groups - what can be concluded? Certainly not that the intervention
has failed. Only that more subjects should be assessed before such
conclusiveness is shown, and we are continuing to locate and assess subjects
presently not found.

For those who are persuaded the evidence is conclusive and those
who grant that a good case has been made but have reservations about the
data being conclusive, there are some important questions raised by the
data.

Why is Concept Training so much more effective than Discovery
Training for subsequent reading, when the latter has an equal effect on the
IQ? Indeed, at grades 5 and 4, the DT group is no better than Control or
Comparison on that skill. It would be extremely difficult to argue the
DT had any effect on reading. And yet, the DT treatment is the most similar
to what occurs in most nursery schools and day care centers today. While
those data refer only to treatments characterized by the one-to-one
situation, it does not seem to be too great a generalization to apply the
results to Group Training as well. The suggestion is that most preschool
training today may well have an effect on IQ scores, but none on reading.




The suggestion is that early intervention programs should include one-to- 
one interaction between teacher and child - and that the training should 
be structured, planned, and teacher-guided. 

Why is reading more affected by training at age two than by training
at three, when IQ scores for T-2 and T-3 are almost identical? Clearly,
these results tend to support those who have argued for intervention to
occur as early as possible, and raise questions about the position that
intervention at any age is equally effective. But one can only speculate
about why. And this report is not the place for that.

Why were there no differences by age or type of training at age 4/8
when training at two and Concept Training yield increased IQ's and reading
scores at age 12? The existence of sleeper effects must be retained as
one possibility. The reading results, it may be argued, are there at 12
and not at 4/8 because no measure was given at 4/8 valid for whatever
reading is. But the IQ's are different at 12 and were not at 4/8 -
the WPPSI was used at 4/8 and the WISC at 12. Those measures gained
part of their utility because the Verbal and Performance scales are
comprised of the same kinds of items, administered in the same manner, and
conceived by the same expert in test construction.

Why is the Performance IQ and not the Verbal IQ affected by the
treatments? Only conjecture is possible at this time, so it will not be discussed.
But the question deserves serious attention by those concerned with early
intervention.

How is it possible that two hours weekly of Concept Training at age
two or three for a period of eight months, with the average number of




hours in training being only 43, affected Performance IQ nine points and
average reading score three months? More speculation, but it did.

And, finally, there is that word significance. At least two
connotations of the word are relevant: statistical significance, which has been
discussed, and significance for the children concerned. Is it significant
that 63 found children have nine points higher Performance IQ's and read
three months ahead - and that we may infer that 60 others not found do
so as well? We think it is.

VI. RECOMMENDATIONS

On the basis of the evidence presented here, it is recommended that
early childhood compensatory education programs:

(1) Begin at an earlier age than is presently so, and

(2) Have imbedded in them periods during
each week when every child receives structured, planned,
teacher-guided and concept oriented one-to-one instruction.






REFERENCES 

Doppelt, J. E. and Kaufman, A. A. Estimation of the Differences Between
WISC-R and WISC IQ's. Journal of Educational and Psychological
Measurement, In Press.

Hollingshead, A. B. The Two Factor Index of Social Position. New Haven,
Conn., Author, 1957.

Palmer, F. H. Socioeconomic Status and Intellective Performance Among
Negro Preschool Boys. Developmental Psychology, 1970, 3, 1-9.

Palmer, F. H. Minimal Intervention at Age Two and Three and Subsequent
Intellective Changes. In R. K. Parker, The Preschool in Action,
Boston: Allyn and Bacon, 1972.




APPENDIX A 

I. Attrition analysis by 3.8 Stanford-Binet IQ (a)

                 1974 - 1976
          Found     Not Found
          (mean)    (mean)        t       d.f.

T-2        97.2      96.3         .32     100
T-3       101.3      97.0        1.72     108
Cont.      91.3      95.3       -1.13      58
CT         99.4      95.8        1.41     107
DT         98.0      98.1        -.02     102
L          97.5      95.5         .81     120
M         101.0     100.2         .28      93




II. Attrition analysis by Hollingshead-Redlich SES Score (a)

                 1974 - 1976
          Found     Not Found
          (mean)    (mean)        t       d.f.

T-2        58.9      57.6         .48     113
T-3        56.9      57.1        -.11     120
Cont.      58.0      58.3        -.11      66
CT         57.4      58.4        -.40     118
DT         59.5      56.7       -1.14     118

a. None of the above comparisons were significant at p ≤ .07, two-tailed.






III. Attrition by Hollingshead SES Categories (a)

(N in 1974-76 sample / N in 1966-7 sample)

            I          II         III        IV         V
Scores    11 - 17    18 - 27    28 - 48     - 60      61 - 77

T-2        0/1        1/6        1/5       23/42      28/69
T-3        0/0        0/3        6/13      29/42      31/66
Cont.      0/1        2/3        2/5       13/23      16/36

a. Scores and categories range from V (61-77) as the lowest SES to I
(11-17) as the highest.




APPENDIX B: Analysis of 1975 Assessment

I. Reading

A. Reading as a function of age at training 



1. t-ratios on individual reading scores

a. Grade 5 (SAT)

                  Mean 1    Mean 2      t            d.f.

T-2 - T-3          5.35      5.02      [illegible]     91
T-2 - Cont.        5.35      5.09       .52            67
T-2 - Comp.        5.35      5.01      1.23           890
T-3 - Cont.        5.02      5.09      -.17            80
T-3 - Comp.        5.02      5.01       .04           903
Cont. - Comp.      5.09      5.01       .24           879



b. Grade 4 (MAT)

                  Mean 1    Mean 2      t        d.f.

T-2 - T-3          4.06      3.79       .88        87
T-2 - Cont.        4.06      3.70      1.00        68
T-2 - Comp.        4.06      3.79      1.31      2134
T-3 - Cont.        3.79      3.70       .27        77
T-3 - Comp.        3.79      3.79       .00      2143
Cont. - Comp.      3.70      3.79      -.38      2124



c. Grade 3 (MAT)

                  Mean 1    Mean 2      t        d.f.

T-2 - T-3          3.60      3.40       .79        77
T-2 - Cont.        3.60      3.39       .66        53
T-2 - Comp.        3.60      3.35      1.24      2194
T-3 - Cont.        3.40      3.39       .03        60
T-3 - Comp.        3.40      3.35       .27      2201
Cont. - Comp.      3.39      3.35       .14      2177






2. Chi-square statistics on differences in proportion of children
reading at or above the national norm.

a. 2 x 4 Chi-Square Statistics

i. Fifth Grade:
                  T-2     T-3     Cont.    Comp.
Above Norm         19      21      10       262
Below Norm         21      32      18       590

Chi-square = 6.61, n.s.

ii. Fourth Grade:
                  T-2     T-3     Cont.    Comp.
Above Norm         12      13       7       379
Below Norm         28      36      23      1717

Chi-square = 6.29, n.s.

iii. Third Grade:
                  T-2     T-3     Cont.    Comp.
Above Norm         17      17       5       642
Below Norm         19      26      14      1518

Chi-square = 7.10, n.s.
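
The 2 x 4 statistics above can be checked directly from the cell counts. The following is a minimal sketch in Python, assuming the tabled values are ordinary Pearson chi-squares with no continuity correction (the report does not state which form was computed). The function and variable names are illustrative and not part of the report; the counts are those read from the tables above, with the Third Grade T-2 "Above Norm" cell taken to be 17.

```python
def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom) for a table of counts.

    `table` is a list of rows; every row has one count per group.
    Assumption: plain Pearson statistic, no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and norm status.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: above norm, below norm. Columns: T-2, T-3, Cont., Comp.
fifth_grade = [[19, 21, 10, 262], [21, 32, 18, 590]]
fourth_grade = [[12, 13, 7, 379], [28, 36, 23, 1717]]
third_grade = [[17, 17, 5, 642], [19, 26, 14, 1518]]

for label, counts in [("Fifth", fifth_grade),
                      ("Fourth", fourth_grade),
                      ("Third", third_grade)]:
    stat, df = pearson_chi_square(counts)
    print(f"{label} grade: chi-square = {stat:.2f}, d.f. = {df}")
```

With three degrees of freedom the .05 critical value is about 7.81, which is consistent with all three comparisons being reported as n.s.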






b. 2 x 2 Chi-square statistics

i. Grade 5:
           T-2      T-3      Cont.    Comp.
T-2                 0.58     0.94     4.97*
T-3                          0.12     1.83
Cont.                                 0.31
Comp.

ii. Grade 4:
           T-2      T-3      Cont.    Comp.
T-2                 0.13     0.39     3.73
T-3                          0.10     2.29
Cont.                                  .55
Comp.

iii. Grade 3:
           T-2      T-3      Cont.    Comp.
T-2                 0.47     2.26     5.16**
T-3                          1.01     1.94
Cont.                                 0.10
Comp.

*  p < .05
** p < .03






B. Reading as a function of type of training

1. t-tests on individual reading scores

a. Grade 5 (SAT)

                  Mean 1    Mean 2      t            d.f.

CT - DT            5.29      4.97       .84            91
CT - Cont.         5.29      5.09       .46          [illegible]
CT - Comp.         5.29      5.01      1.18           905
DT - Cont.         4.97      5.09      -.25            65
DT - Comp.         4.97      5.01      [illegible]    888
Cont. - Comp.      5.09      5.01       .24           879



b. Grade 4 (MAT)

                  Mean 1    Mean 2      t        d.f.

CT - DT            3.95      3.86       .29        87
CT - Cont.         3.95      3.70       .75        79
CT - Comp.         3.95      3.79       .87      2145
DT - Cont.         3.86      3.70       .43        66
DT - Comp.         3.86      3.79       .33      2132
Cont. - Comp.      3.70      3.79      -.38      2124



c. Grade 3 (MAT)

                  Mean 1    Mean 2      t        d.f.

CT - DT            3.49      3.50      -.04        77
CT - Cont.         3.49      3.39       .32        66
CT - Comp.         3.49      3.35       .81      2207
DT - Cont.         3.50      3.39       .33        47
DT - Comp.         3.50      3.35       .68      2186
Cont. - Comp.      3.39      3.35       .14      2177






2. Chi-square statistics on differences in proportion of children
reading at or above the national norm.

a. 2 x 4 Chi-Square Statistics

i. Fifth Grade:
                  DT      CT      Cont.    Comp.
Above Norm         14      26      10       262
Below Norm         24      29      18       590

Chi-square = 7.08, n.s.

ii. Fourth Grade:
                  DT      CT      Cont.    Comp.
Above Norm         10      15       7       379
Below Norm         28      36      23      1717

Chi-square = 6.26, n.s.

iii. Third Grade:
                  DT      CT      Cont.    Comp.
Above Norm         13      21       5       642
Below Norm         17      28      14      1518

Chi-square = 6.55, n.s.






b. 2 x 2 Chi-square statistics

i. Grade 5:
           CT       DT       Cont.    Comp.
CT                  1.00     1.01     6.51**
DT                           0.01     0.63
Cont.                                 0.31
Comp.

ii. Grade 4:
           CT       DT       Cont.    Comp.
CT                   .10      .35     4.27*
DT                            .08     1.70
Cont.                                  .55
Comp.

iii. Grade 3:
           CT       DT       Cont.    Comp.
CT                  0.00     1.59     3.91*
DT                           1.45     2.61
Cont.                                 0.10
Comp.

*  p < .05
** p < .025






C. Reading as a function of SES

1. t-tests on individual reading scores

a. Grade 5 (SAT)

                  Mean 1    Mean 2      t            d.f.

L - M              4.85      5.54     -1.86(a)         91
L - Cont.          4.85      5.09      -.57            78
L - Comp.          4.85      5.01      [illegible]    901
M - Cont.          5.54      5.09       .95            69
M - Comp.          5.54      5.01      1.97           892
Cont. - Comp.      5.09      5.01       .24           879



b. Grade 4 (MAT)

                  Mean 1    Mean 2      t            d.f.

L - M              3.70      4.12     -1.37            87
L - Cont.          3.70      3.70       .00            72
L - Comp.          3.70      3.79      [illegible]   2138
M - Cont.          4.12      3.70      1.26            73
M - Comp.          4.12      3.79      1.70          2139
Cont. - Comp.      3.70      3.79      -.38          2124



c. Grade 3 (MAT)

                  Mean 1    Mean 2      t        d.f.

L - M              3.52      3.46       .25        77
L - Cont.          3.52      3.39       .40        58
L - Comp.          3.52      3.35       .90      2199
M - Cont.          3.46      3.39       .23        55
M - Comp.          3.46      3.35       .56      2196
Cont. - Comp.      3.39      3.35       .14      2177

*  p < .05
** p < .025
a. One-tailed




2. Chi-square statistics for differences in proportion of children
reading at or above the national norm.

a. 2 x 4 Chi-Square Statistics

i. Fifth Grade:
                  L       M       Cont.    Comp.
Above Norm         18      22      10       262
Below Norm         33      20      19       590

Chi-square = 8.96, p < .05

ii. Fourth Grade:
                  L       M       Cont.    Comp.
Above Norm          9      16       7       379
Below Norm         35      29      23      1717

Chi-square [illegible], p < .025

iii. Third Grade:
                  L       M       Cont.    Comp.
Above Norm         18      16       5       642
Below Norm         23      22      14      1518

Chi-square = 6.58, n.s.






b. 2 x 2 Chi-square statistics

i. Grade 5:
           L        M        Cont.    Comp.
L                   2.74     0.01     0.46
M                            2.22     [illegible]
Cont.                                 0.18
Comp.

ii. Grade 4:
           L        M        Cont.    Comp.
L                   2.51      .09      .16
M                            [illegible]
Cont.                                 0.55
Comp.

iii. Grade 3:
           L        M        Cont.    Comp.
L                    .03     1.70     3.85*
M                            1.36     2.73
Cont.                                 0.10
Comp.

*  p < .05
** p < .01






II. IQ

A. WISC full scale scores

1. t-tests between treatment groups

                  Mean 1    Mean 2       t          d.f.

T-2 - T-3          99.40     99.36       .02         110
T-2 - Cont.        99.40     93.16      1.80*         78
T-3 - Cont.        99.36     93.16      2.00**        84
CT - DT           100.16     98.34       .75         110
CT - Cont.        100.16     93.16      2.24++        89
DT - Cont.         98.34     93.16      1.50          73
L - M              96.31    102.14     -2.47++       111
L - Cont.          96.31     93.16       .99          78
M - Cont.         102.14     93.16      2.74++        84

*  p < .05, one-tailed
** p < .025, one-tailed
++ p < .01, one-tailed






B. WISC verbal scale score

1. t-tests between treatment groups

                  Mean 1    Mean 2       t          d.f.

T-2 - T-3         100.32     99.97       .13         110
T-2 - Cont.       100.32     96.26      1.04          78
T-3 - Cont.        99.97     96.26      1.04          84
CT - DT           100.36     99.83       .19         110
CT - Cont.        100.36     96.26      1.53          89
DT - Cont.         99.83     96.26       .98          73
L - M              97.62    102.39     -1.72*        110
L - Cont.          97.62     96.26       .37          78
M - Cont.         102.39     96.26      1.64          84

* p < .05, one-tailed






C. WISC performance scale score

1. t-tests between treatment groups

                  Mean 1    Mean 2       t          d.f.

T-2 - T-3          97.92     98.82       .37         110
T-2 - Cont.        97.92     90.76      2.16**        78
T-3 - Cont.        98.82     90.76      2.78++        84
CT - DT            99.38     97.07       .95         110
CT - Cont.         99.38     90.76      2.94++        89
DT - Cont.         97.07     90.76      1.91*         73
L - M              95.79    101.08     -2.27**       110
L - Cont.          95.79     90.76      1.66          78
M - Cont.         101.08     90.76      3.42++        84

*  p < .05
** p < .025
++ p < .005




III. Descriptive statistics

A. Age 10-12 (WISC)

             N       Mean      S.D.

CT           64     100.16     12.34
DT           48      98.34     13.25
T-2          53      99.40     13.79
T-3          59      99.36     11.78
L            53      96.31     11.73
M            59     102.14     13.02
Cont.        27      93.16     16.27

B. Grade 5 (SAT)

             N       Mean      S.D.

CT           55       5.29      1.78
DT           38       4.97      1.83
T-2          40       5.35      2.02
T-3          53       5.02      1.61
L            51       4.85      1.66
M            42       5.54      1.91
Cont.        29       5.09      2.06
Comp.       852       5.01      1.69
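
The t-ratios in Section II of this appendix can be approximately recovered from these descriptive statistics. The sketch below assumes an equal-variance (pooled) two-sample t-test with N1 + N2 - 2 degrees of freedom; the report does not name the formula, but the tabled degrees of freedom fit that form. The function name is illustrative, and the inputs are the WISC full scale values for the M, T-2 and Control groups given in A above.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t from summary statistics, pooled-variance form.

    Assumption: equal population variances, so d.f. = n1 + n2 - 2.
    Returns (t, d.f.).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_error, df


# WISC full scale: (N, mean, S.D.) from the descriptive table.
m_group = (59, 102.14, 13.02)
t2_group = (53, 99.40, 13.79)
control = (27, 93.16, 16.27)

t_m, df_m = pooled_t(*m_group, *control)
t_t2, df_t2 = pooled_t(*t2_group, *control)
print(f"M - Cont.:   t = {t_m:.2f}, d.f. = {df_m}")
print(f"T-2 - Cont.: t = {t_t2:.2f}, d.f. = {df_t2}")
```

Under these assumptions the two results come out close to the tabled values for M - Cont. and T-2 - Cont. in II.A, which suggests the reported t-ratios are of this pooled form.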






C. Grade 4 (MAT)

             N             Mean      S.D.

CT           51            3.95      [illegible]
DT           38            3.86      [illegible]
T-2          40            4.06      1.46
T-3          [illegible]   3.79      1.41
L            [illegible]   3.70      1.52
M            [illegible]   4.12      1.33
Cont.        30            3.70      1.53
Comp.      2096            3.79      1.29

D. Grade 3 (MAT)

             N             Mean      S.D.

CT           49            3.49      1.14
DT           [illegible]   3.50      [illegible]
T-2          36            3.60      1.09
T-3          43            3.40      1.15
L            41            3.52      1.17
M            38            3.46      1.08
Cont.        19            3.39      1.16
Comp.      2160            3.35      1.20






